The question I set out to answer is whether the literacy rates of men and women were correlated in 2010 and 2011.
My permutation tests of the correlation show that the result is significant in both graphs. The observed correlations fell past the 95th percentile of the permuted correlations, so the correlation between the literacy rates is significant.
I conclude that the correlation between women's and men's literacy rates across the 10 countries is significant. The observed correlation of 0.97 is positive, so the two literacy rates rise together.
Is there a correlation between the percentage of money governments spend on higher education and the average income per person at the global level?
Null Hypothesis: There is no correlation between the percentage of money governments spend on higher education and the average income per person at the global level.
Alternative Hypothesis: There is a correlation between the percentage of money governments spend on higher education and the average income per person at the global level.
To explain the connection between government spending and GDP, the following analogy is helpful. If a band covers 40% of your ticket, you pay only the other 60%, whatever the price. However, if they instead promise to cover $50 of your ticket, then as inflation raises the price they cover less and less of it. Government funding of colleges seems to follow the same trend: the percentage of college funding is decreasing while income is increasing. It looks as though governments are spending a fixed amount even though average income per person is rising.
The correlation between income and government spending on college is negative, and it is significant according to the permutation correlation test. The correlation of -0.53 is statistically significant because it lies below the 5th percentile of the permuted correlations.
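The permutation correlation test used throughout this report can be sketched roughly as follows. This is a minimal illustration in base R, not the code behind the results above: `x` and `y` are simulated stand-ins for the real columns, and the 2000 shuffles and the one-sided lower-tail cutoff are assumptions chosen for the example.

```r
# Permutation test for a correlation (base R only).
# x and y are simulated stand-ins, not the report's data.
set.seed(42)
n <- 50
x <- rnorm(n)
y <- -0.5 * x + rnorm(n)  # constructed so that y tends to fall as x rises

observed <- cor(x, y)

# Shuffling y breaks its pairing with x, which simulates the null hypothesis
n_perm <- 2000
perm_cors <- replicate(n_perm, cor(x, sample(y)))

# 5th percentile of the shuffled correlations (lower-tail cutoff)
cutoff <- quantile(perm_cors, 0.05)

# One-sided p-value: share of shuffled correlations at or below the observed one
p_value <- mean(perm_cors <= observed)

cat("observed correlation:", round(observed, 3), "\n")
cat("5th percentile of shuffled correlations:", round(cutoff, 3), "\n")
cat("below the 5th percentile:", observed < cutoff, "\n")
```

For a positive correlation the same idea applies with the upper tail: compare the observed value with `quantile(perm_cors, 0.95)` instead.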
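mapp <- function(data) {
  # Drop the first column; the remaining columns are assumed to be doubles
  data <- data[-1]
  # One-row tibble per statistic, with one column per variable
  means <- as_tibble(map(data, mean))
  medians <- as_tibble(map(data, median))
  mins <- as_tibble(map(data, min))
  maxes <- as_tibble(map(data, max))
  labels <- c("mean", "median", "min", "max")
  # full_join() on all shared columns stacks the one-row tibbles
  centre <- full_join(means, medians)
  extremes <- full_join(mins, maxes)
  stats <- full_join(centre, extremes) %>% mutate(map_type = labels)
  # Put the label column first (assumes exactly two data columns)
  stats <- stats[c(3, 2, 1)]
  print(stats)
}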
mapp(GDP)
## # A tibble: 4 x 3
## map_type income year
## <chr> <dbl> <dbl>
## 1 mean 6393. 1920
## 2 median 1640 1920
## 3 min 247 1800
## 4 max 182000 2040
mapp(cost)
## # A tibble: 4 x 3
## map_type cost year
## <chr> <dbl> <dbl>
## 1 mean 71.7 2007.
## 2 median 30.4 2008
## 3 min 2.85 1995
## 4 max 2530 2017
mapp() uses map() and full_join() to build a tibble of the mean, median, min, and max of each column of a dataset whose columns are all doubles once the first column is dropped. I applied it to the two datasets I was comparing, GDP and cost (government spending on higher education).
Subquestion: What is the correlation between the gender ratio in primary and secondary school and the HDI index for a country?
This contributes to answering the overall question because it incorporates the overarching theme of education and how it relates to other sectors of a country.
Null hypothesis: There is not a significant correlation between the ratio of boys to girls in primary and secondary school and the HDI index of a country.
Alternative hypothesis: There is a significant correlation between the ratio of boys to girls in primary and secondary school and the HDI index of a country.
The test statistic is the correlation.
funk <- function(df, fun) {
  # Apply fun to each column of df and collect the results in a double vector
  out <- vector("double", length(df))
  for (i in seq_along(df)) {
    out[i] <- fun(df[[i]])
  }
  out
}
merged1 <- merged %>% transmute(hdi = hdi_2015, ratio = ratio_2015)
funk(merged1, sd) # find the standard deviation of hdi and ratio
## [1] 0.16079952 0.07332416
hdi1 <- merged1$hdi
The function above finds the standard deviation of HDI index and ratio.
## [1] 0.479 0.764 0.786 0.827 0.743 0.939 0.893 0.824 0.579 0.795 0.796
## [12] 0.896 0.706 0.485 0.607 0.674 0.754 0.865 0.794 0.402 0.404 0.518
## [23] 0.920 0.648 0.396 0.847 0.738 0.727 0.435 0.776 0.474 0.827 0.775
## [34] 0.856 0.878 0.925 0.473 0.726 0.722 0.739 0.680 0.420 0.865 0.448
## [45] 0.895 0.897 0.769 0.926 0.579 0.866 0.754 0.640 0.625 0.836 0.921
## [56] 0.624 0.689 0.774 0.923 0.899 0.887 0.903 0.794 0.800 0.664 0.586
## [67] 0.830 0.763 0.497 0.427 0.912 0.848 0.898 0.748 0.512 0.476 0.789
## [78] 0.442 0.856 0.513 0.781 0.762 0.699 0.735 0.807 0.418 0.558 0.924
## [89] 0.915 0.353 0.949 0.796 0.550 0.684 0.788 0.740 0.682 0.855 0.843
## [100] 0.856 0.802 0.804 0.498 0.704 0.574 0.494 0.776 0.782 0.420 0.845
## [111] 0.890 0.666 0.901 0.418 0.884 0.722 0.490 0.725 0.541 0.913 0.939
## [122] 0.740 0.606 0.767 0.910 0.920 0.701 0.597 0.767
Using map_dbl(), I calculated the mean of each row of the HDI index.
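The loop-based helper funk() and purrr's map_dbl() do the same job of applying a function to each column and returning doubles. The sketch below compares them on a small made-up data frame; `demo` and its values are hypothetical stand-ins for merged1, not the report's data.

```r
library(purrr)

# Hypothetical stand-in for merged1: two double columns
demo <- data.frame(
  hdi = c(0.48, 0.76, 0.79, 0.83),
  ratio = c(0.91, 1.00, 0.98, 1.02)
)

# The for-loop helper from the report
funk <- function(df, fun) {
  out <- vector("double", length(df))
  for (i in seq_along(df)) {
    out[i] <- fun(df[[i]])
  }
  out
}

loop_sd <- funk(demo, sd)
map_sd <- map_dbl(demo, sd)  # same values, but the column names are kept

print(loop_sd)
print(map_sd)
```

The only visible difference is that map_dbl() returns a named vector, which makes it easier to tell which standard deviation belongs to which column.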
Question: Is there a correlation between completion of primary school and the gender ratio of mean years in school in 1999 and 2015? This question is important and interesting because it will show how gender equality differs by country and help us understand why this is happening.
Null hypothesis: There is not a significant correlation between the gender ratio of mean years in school and primary school completion, and there is no difference by country.
Alternative hypothesis: There is a significant correlation between the gender ratio of mean years in school and primary school completion, and there is a difference by country.
## 1999.x 2015.x 1999.y 2015.y
## 16.21304 14.48277 23.45085 12.41970
Findings: From my correlation permutation tests I found that for both years there is a significant correlation between the gender ratio of the number of years attending school up to university and completion of primary school. The correlation was 0.8 for 1999 and 0.6 for 2015, so the positive correlation across countries was stronger in 1999 than in 2015. This makes sense because poorer countries still have more gender inequality than more developed countries. Also, since both correlations are past the 95th percentile, we can say both are statistically significant.
Description: My function uses map_dbl to show the standard deviation of a data frame for each column.
relevant_stats <- function(df, stat_type) {
  # Return labelled per-column statistics for the requested stat type
  if (stat_type == "spread") {
    type1 <- "Standard Deviation(s)"
    output1 <- map_dbl(df, sd)
    type2 <- "Inter-Quartile Range(s)"
    output2 <- map_dbl(df, IQR)
    list(type1, output1, type2, output2)
  } else if (stat_type == "average") {
    type1 <- "Median(s)"
    output1 <- map_dbl(df, median)
    type2 <- "Mean(s)"
    output2 <- map_dbl(df, mean)
    # Include the labels here too, so every branch returns the same shape
    list(type1, output1, type2, output2)
  } else if (stat_type == "extreme") {
    type1 <- "Max(s)"
    output1 <- map_dbl(df, max)
    type2 <- "Min(s)"
    output2 <- map_dbl(df, min)
    list(type1, output1, type2, output2)
  } else {
    # Unknown stat type: return a hint instead of statistics
    "Please Enter a Known Stat Type (spread, average, extreme)"
  }
}
## [[1]]
## [1] "Max(s)"
##
## [[2]]
## math_scores_2007 outofschool_2007 pop
## 5.700000e+02 1.040000e+06 2.235470e+08
## out_of_school_perc
## 4.546778e-02
##
## [[3]]
## [1] "Min(s)"
##
## [[4]]
## math_scores_2007 outofschool_2007 pop
## 3.070000e+02 3.710000e+02 7.884570e+05
## out_of_school_perc
## 1.529796e-04
Subquestion: Is there a correlation between the proportion of children in a given country that don’t go to school and the average 8th grade TIMSS (math achievement) score for that country?
Graph #1 Description: This scatterplot displays the raw data for the proportion of children out of school and TIMSS scores for 8th graders. I used geom_point and if_else statements to specify which countries I wanted labeled. Note: I could only access the total population of the given countries and not the child population, so my x-axis displays the proportion of the entire country's population that are children out of school. This should have a negligible effect on the correlation.
Conclusion / Percentile: My histogram shows that we can reject the null hypothesis, as there is a clear negative correlation between the proportion of a country's children out of school and 8th grade math achievement for that country. The 5th percentile of the permuted correlations is -0.298, and the actual correlation of -0.475 falls well below it.
Function / Mapping Description: My function takes a data frame and a statistic type as parameters and uses conditional statements to return various statistics of that data frame based on the statistic type. map_dbl is used to get the relevant statistics for each column of the data frame. For example, if the user inputs "doubles" as the first parameter and "extreme" as the second parameter, my function will get the maxes and mins for every column of the all-doubles data frame. The output is a list so I can print multiple mappings.
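As an alternative design, the three branches could be driven by a lookup table instead of an if/else chain. The sketch below is only an illustration of that idea: `stat_table`, `relevant_stats_sketch`, and the toy `doubles` data frame are hypothetical names and values, not part of the original analysis.

```r
library(purrr)

# Hypothetical lookup table: each stat type maps to two labelled summary functions
stat_table <- list(
  spread  = list("Standard Deviation(s)" = sd, "Inter-Quartile Range(s)" = IQR),
  average = list("Median(s)" = median, "Mean(s)" = mean),
  extreme = list("Max(s)" = max, "Min(s)" = min)
)

relevant_stats_sketch <- function(df, stat_type) {
  funs <- stat_table[[stat_type]]
  if (is.null(funs)) {
    # Unknown stat type: return a hint instead of statistics
    return("Please Enter a Known Stat Type (spread, average, extreme)")
  }
  # Apply each summary function to every column; the labels become list names
  map(funs, function(f) map_dbl(df, f))
}

# Toy all-doubles data frame standing in for the real one
doubles <- data.frame(a = c(1, 2, 3, 10), b = c(0.5, 0.25, 0.75, 1))
result <- relevant_stats_sketch(doubles, "extreme")
print(result)
```

Because the labels are stored as list names, the result can be indexed by label, for example `result[["Max(s)"]]`, and adding a new stat type only requires a new entry in the table.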
* I am answering the subquestion: what is the correlation between gender ratio in primary and secondary school and the HDI index for a country.
Since the actual correlation is greater than the 95th percentile of the permuted correlations, we can conclude that the result is significant and that there is a correlation.